Source sentence simplification for statistical machine translation
Related articles
Sentence Simplification by Monolingual Machine Translation
In this paper we describe a method for simplifying sentences using Phrase Based Machine Translation, augmented with a re-ranking heuristic based on dissimilarity, and trained on a monolingual parallel corpus. We compare our system to a word-substitution baseline and two state-of-the-art systems, all trained and tested on paired sentences from the English part of Wikipedia and Simple Wikipedia. ...
Optimizing Statistical Machine Translation for Text Simplification
Most recent sentence simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus. These methods are limited by the quality and quantity of manually simplified corpora, which are expensive to build. In this paper, we conduct an in-depth adaptation of statistical machine translation to perform text simplification...
Transformation-based Sentence Splitting method for Statistical Machine Translation
We propose a transformation-based sentence splitting method for statistical machine translation. The transformations are obtained automatically from a manually split corpus and then expanded to improve machine translation quality. Through a series of experiments, we show that transformation-based sentence splitting is an effective pre-processing step for translating long sentences.
Sentence Type Based Reordering Model for Statistical Machine Translation
Many reordering approaches have been proposed for statistical machine translation (SMT) systems. However, information about the type of the source sentence has been ignored in previous work. In this paper, we propose a group of novel reordering models based on the source sentence type for Chinese-to-English translation. In our approach, an SVM-based classifier is employed to classify the given...
Journal
Journal title: Computer Speech & Language
Year: 2017
ISSN: 0885-2308
DOI: 10.1016/j.csl.2016.12.001